ABSTRACT

Group norms are provided for the California Critical 
Thinking Skills Test (CCTST) — College Level, a standardized 34-item 
multiple-choice test designed to assess the core critical thinking 
skills associated with baccalaureate general education. The CCTST 
offers three subtests conceptualized in terms of a national Delphi 
study on critical thinking. These three subtests — analysis, 
evaluation, and inference — correlate strongly with each other and the 
overall CCTST when used as either a pretest or posttest. Subtests are 
also offered based on the more traditional division of reasoning into 
"deductive reasoning" and "inductive reasoning." These latter two 
subtests also correlate strongly with each other and the overall 
CCTST when used as either a pretest or posttest. Statistical 
analyses, correlations, and recommended percentile rankings for raw 
scores are presented in nine tables. These norms were developed on 
the basis of analyses of 1,673 test forms for representative samples 
of college students at a comprehensive urban state university during 
the 1989-90 school year. (SLD) 
Abstract 

Technical Report #4, in a series of four, provides group norms for the 
California Critical Thinking Skills Test: College Level, a standardized 
testing instrument designed to assess the core critical thinking skills 
associated with baccalaureate general education. The CCTST offers three 
sub-tests conceptualized in terms of the recently completed national Delphi 
study, Critical Thinking: A Statement of Expert Consensus for Purposes of 
Educational Assessment and Instruction. These three sub-tests, "Analysis," 
"Evaluation," and "Inference," correlate strongly with each other and with 
the overall CCTST. The CCTST also offers sub-tests based on the more 
traditional division of the reasoning arts into "Deductive Reasoning" and 
"Inductive Reasoning." Complete statistical analyses, correlations and 
recommended percentile rankings for raw scores on each of the five sub-tests 
as well as for the CCTST overall, used either in a pretest or posttest 
context, are presented in tabular form in this technical report. These norms 
have been developed on 
the basis of analyses of 1673 test forms completed by representative samples 
of college students during the 1989/90 academic year at a comprehensive urban 
state university. Technical Report #1 in this series reports on the content 
validity of the CCTST and its experimental validation during 1989/90. 
Technical Report #2 describes the concurrent validity of the CCTST in terms of 
its correlations with SAT-verbal, SAT-math, college GPA, and Nelson-Denny 
Reading Test scores. Technical Report #3 reports on the relationship between 
CCTST and four student-related variables: gender, ethnicity, academic major 
and CT self-esteem. 



The California Critical Thinking Skills Test: College Level 
Technical Report #4 - 

Interpreting the CCTST, Group Norms and Sub-Scores 

by 

Peter A. Facione 
Santa Clara University 

Recap of Previous Findings 

This Technical Report, the fourth and final in this series, provides detailed 
statistical information on the five CCTST sub-tests. Three sub-tests are conceptualized in 
terms of the recently completed national Delphi study, Critical Thinking: A Statement of 
Expert Consensus for Purposes of Educational Assessment and Instruction (Facione, 1990 a). 
These three sub-tests, "Analysis," "Evaluation," and "Inference," correlate strongly with 
each other and with the overall CCTST, used as either a pretest or a posttest. The same is 
true of the two CCTST sub-tests, "Deductive Reasoning" and "Inductive Reasoning," which 
divide CCTST items along that more traditional matrix. Recommended percentile 
rankings for raw scores on each of the five sub-tests and for the CCTST overall - used either as 
pretests or posttests - have been developed. The statistical analyses which form the basis 
for these recommendations were conducted on the 1673 CCTST test forms completed by 
representative samples of college students enrolled in campus approved critical thinking 
courses and control group courses during the 1989/90 academic year at a comprehensive 
urban state university. 

Technical Report #1 in this series discussed the content validity of the CCTST in 
terms of the conceptualization of CT expressed in Critical Thinking: A Statement of 
Expert Consensus for Purposes of Educational Assessment and Instruction as well as the 
concept of CT grounding the system-wide CT general studies requirement of the 
California State University. Also, Technical Report #1 described a series of four 
experiments which indicated that the CCTST is an effective measure of the improvements 
in the core CT skills of interpretation, analysis, evaluation, inference and explanation 
which occur as a result of taking a lower division college level CT course. During 1989/90, 
data was collected on a variety of variables relating to the 20 instructors and the 1196 
college students who participated in these experiments. Those studied were either 
teaching or enrolled in 45 sections of five different courses offered by three departments, 
(Facione, 1990 c). 

Technical Report #2 described the relationship of CCTST results to a number of 
student-related and instructor-related variables. Critical thinking skills, as measured on 
the CCTST, can be predicted by a combination of SAT verbal, SAT math, and GPA data 
with R-square = .41. If CCTST pretest data are included in the regression model, the 
R-square = .71. A college student's age, units of college work completed, and high school 
subject matter preparation, and an instructor's teaching experience do not contribute 
significantly to the regression models which predict CCTST posttest results. CCTST 
results positively correlated with Nelson-Denny reading scores for vocabulary, 
comprehension, and total score. Non-native English speakers show virtually no gain from 
CCTST pretest to posttest and, hence, use of the CCTST for non-native English speaking 
students is counter-indicated. Of six instructor-related factors which are thought to be 
related to effectiveness in teaching CT skills, only years of teaching experience and recent 
experience teaching CT are related, and these in non-linear ways. No evidence was found 
to support the hypothesis that CT skill development is a natural outcome of baccalaureate 
education, either in general, or by reference to the control groups, (Facione, 1990 d). 

Technical Report #3 examined the CCTST in terms of the possible impact of 
student gender, ethnicity, academic major and CT self-esteem on CT skill performance. 
Analyses of pretest data and control group data show that the CCTST is not gender- 
biased. Statistically significant gender differences emerge only after students complete 
their college level CT course. ANCOVA also indicated that the CCTST does not favor or 
disadvantage any particular ethnic or racial group. However, not all groups appeared to 
benefit equally from having completed their approved college level CT course. While 
academic major was not a significant factor on the CCTST pretest, scores on the posttest 
did vary significantly by major. Student CT self-confidence, which appears unrealistically 
high, did correlate with relative success on the CCTST. However, when SAT and native 
language were controlled, CT self-confidence was not a significant factor in explaining 
pretest or posttest results. The emergence of significant differences by gender, ethnicity 
and major on the CT posttests indicated an urgent need for research on student learning 
relative to CT curriculum and CT pedagogy, (Facione, 1990 e). 



CCTST Pretest and Posttest Norms 

In its final form the CCTST is a standardized 34 item multiple choice assessment 
tool. Twenty of the questions offer four choices, fourteen offer five. For purposes of CT 
skill assessment, one answer has been designated the superior choice on each question. 
All distractors ("wrong" answers) were selected by some subjects in the CCTST validation 
studies during 1989/90 as well as in the prior years of individual item pilot testing. 
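
As a rough aid to reading raw scores, the item format just described implies an expected score from blind guessing. The Python sketch below works through that arithmetic. It assumes every item is answered by a uniform, independent random guess; that assumption is introduced here for illustration only and is not a claim drawn from the validation studies.

```python
from fractions import Fraction

# Item format reported in the text: 20 four-choice items and 14 five-choice items.
ITEM_FORMAT = {4: 20, 5: 14}  # number of choices -> number of items


def expected_chance_score(item_format):
    """Expected number correct if every item is answered by a uniform random guess."""
    return sum(Fraction(n_items, n_choices)
               for n_choices, n_items in item_format.items())


def chance_score_variance(item_format):
    """Variance of the number correct under independent uniform guessing."""
    total = Fraction(0)
    for n_choices, n_items in item_format.items():
        p = Fraction(1, n_choices)          # chance of guessing one item correctly
        total += n_items * p * (1 - p)      # binomial variance for this item type
    return total


if __name__ == "__main__":
    mean = expected_chance_score(ITEM_FORMAT)
    variance = chance_score_variance(ITEM_FORMAT)
    print(f"items on the test: {sum(ITEM_FORMAT.values())}")
    print(f"expected chance score: {float(mean):.2f}")
    print(f"standard deviation under guessing: {float(variance) ** 0.5:.2f}")
```

Under these assumptions the expected chance score is 20/4 + 14/5 = 7.8 items correct out of 34.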



To establish stable pretest and posttest norms the largest possible number of 
subjects was used. Pretest norms are based on the responses of 781 college students who 
completed the CCTST as a pretest in Feb. 1990 during week one or two of an approved 
CT course or who completed the CCTST as either a pretest or posttest in the control 
group (non-CT) course.¹ Posttest norms are based on the responses of 892 college 
students who completed the CCTST in Nov. 1989 or May 1990 during week 14 or 15 of a 
three semester unit college level course approved as meeting a campus general studies CT 
requirement. Table 1 displays pretest and posttest statistics. 



Of the 1673 tests evaluated, the top score achieved was a posttest 31 and the lowest 
a pretest 2. There is room for group movement both above and below both means as well 
as beyond the outliers of both the pretest and posttest. The statistics on Table 1 and the 
histographic representation of the curves produced on the pretest and on the posttest on 
Table 2 indicate that both curves are sufficiently normal. 

Table 1 

Statistical Analysis of Pretest and Posttest Groupings 

Table 2 

Normal Curves for Pretest and Posttest Groupings 

Table 3 represents the percentiles recommended to be associated with each raw 
score for the pretest and for the posttest. For example, a student who answers 20 correctly 
on the pretest would rank in the 86th percentile. The same number correct on the posttest 
would rank in the 75th percentile. This drop in percentile ranking is to be expected 
because of the measurable improvement in the group's CT skills. 
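
The way the same raw score maps to different percentiles in two groups can be sketched in Python. The report does not state the exact formula behind its recommended percentiles, so the sketch assumes the cumulative percent of the group scoring at or below each raw score, capped at 99. The two score lists are hypothetical and are not data from the norm sample.

```python
from collections import Counter


def percentile_table(scores, max_score, cap=99):
    """Map each raw score 0..max_score to the percent of the group at or below it."""
    counts = Counter(scores)
    n = len(scores)
    table = {}
    running = 0
    for raw in range(max_score + 1):
        running += counts.get(raw, 0)                 # cumulative frequency
        table[raw] = min(cap, round(100 * running / n))
    return table


if __name__ == "__main__":
    # Hypothetical raw scores for two small groups (illustration only).
    pretest = [8, 10, 12, 13, 15, 15, 16, 18, 20, 22]
    posttest = [10, 12, 14, 16, 17, 19, 20, 21, 23, 25]
    pre = percentile_table(pretest, max_score=34)
    post = percentile_table(posttest, max_score=34)
    print("raw score 20: pretest percentile", pre[20],
          "posttest percentile", post[20])
```

Because the hypothetical posttest group scores higher overall, a raw score of 20 ranks lower in it than in the pretest group, which is the pattern the text describes.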

Table 3 

Percentile Rankings for Pretest and Posttest Raw Scores 

To ensure a more accurate representation of the population of college students at 
comprehensive public universities, the norms and percentile recommendations presented 
in this technical report include the scores of native and non-native English speaking 
students. To the extent a given group of persons tested with the CCTST might differ from 
this norm group on factors predictive of CT skills, CCTST users should consider making 
local modifications in the recommended percentiles. For details consult Technical 
Reports #2 and #3, (Facione, 1990 d, e).² 



The Analysis, Evaluation, and Inference Sub-Tests 

The items on the CCTST can be divided along either of two theoretical matrices. 
The first, developed out of the Delphi research, sub-divides the entire CCTST, all 34 
items, into three distinct groupings: Analysis, Evaluation and Inference. The second, 
using a more traditional conceptualization, sub-divides 30 of the 34 CCTST items into two 
groupings: Deductive Reasoning and Inductive Reasoning. 

Using the Delphi matrix, items 1-9 fall under the sub-score named "analysis" and 
relate to the core CT skills of interpretation and analysis. Items 10-13 and 25-34 are 
grouped under the sub-score of "evaluation" and relate to the core CT skills of evaluation 
and explanation. Items 14-24 are grouped under "inference" and relate to the core CT skill 
of inference, (Facione, 1990 c). Thus, on the Delphi matrix each item is included on one, 
and only one, sub-test; nine items are used for "analysis," fourteen for "evaluation," and 
eleven for "inference."³ 
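
The item assignment described above can be written out as a small Python sketch that computes the three Delphi matrix sub-scores and confirms that each item falls on exactly one sub-test. The function names and the sample response record are hypothetical; only the item ranges come from the text.

```python
# Item-to-sub-test assignment on the Delphi matrix, as stated in the text.
DELPHI_MATRIX = {
    "analysis": list(range(1, 10)),                            # items 1-9
    "evaluation": list(range(10, 14)) + list(range(25, 35)),   # items 10-13 and 25-34
    "inference": list(range(14, 25)),                          # items 14-24
}


def delphi_sub_scores(correct_items):
    """Raw sub-score per sub-test, given the item numbers answered correctly."""
    correct = set(correct_items)
    return {name: sum(1 for item in items if item in correct)
            for name, items in DELPHI_MATRIX.items()}


def is_partition(matrix, n_items=34):
    """True if every item 1..n_items appears on one, and only one, sub-test."""
    assigned = sorted(item for items in matrix.values() for item in items)
    return assigned == list(range(1, n_items + 1))


if __name__ == "__main__":
    print({name: len(items) for name, items in DELPHI_MATRIX.items()})
    print("each item on exactly one sub-test:", is_partition(DELPHI_MATRIX))
    # Hypothetical record of the items one student answered correctly.
    student_correct = [1, 2, 3, 5, 8, 10, 12, 14, 15, 16, 20, 25, 30, 34]
    print(delphi_sub_scores(student_correct))
```

Because the three sub-tests partition the 34 items, the three sub-scores always sum to the overall CCTST raw score.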

Table 4 indicates the correlations between each sub-score and the two others, as 
well as between each and the overall pretest and posttest scores. Table 5 displays 
statistical data for each of the three sub-tests and Table 6 indicates the recommended 
percentile rankings for each in both the pretest and posttest contexts using raw scores. 
Discretion is recommended in the use of sub-test results and percentile rankings. Use 
should be restricted to diagnostic purposes, the evaluation of CT programs, or the 
assessment of aggregate groups of students, in contrast to summative evaluations made of 
individual persons.⁴ 

Table 4 

Correlations of Delphi Matrix Sub-Tests 

For example, to find the correlation between a sub-score on "Inference" and the CCTST overall 
used in a posttest context, read "POSTTEST" across to the two columns on the right. The 
correlation of pretest inference sub-scores [...] posttest inference [...] 
pretest overall score whereas the posttest sub-scores correlate more strongly with the CCTST 
[...] intersection of the appropriate row and column. 

Table 5 

Statistical Analyses of Delphi Matrix Sub-Tests 

                    Analysis              Evaluation             Inference
               PRE-ANAL  POST-ANAL   PRE-EVAL  POST-EVAL   PRE-INFR  POST-INFR

Mean              4.454      4.766      5.406      6.178      6.141      6.349
Std Err            .064       .052       .100       .090       .083       .069
Median       [illegible]     5.000      5.000      6.000      6.000      4.000
Mode              5.000      5.000      6.000      6.000      7.000      6.000
Std Dev           1.558      1.536      2.449      2.662      2.028      2.052
Variance          2.428      2.361      5.998      7.088      4.112      4.211
Kurtosis          -.023      -.256      -.165      -.414      -.210      -.197
S E Kurt           .199       .165       .199       .165       .199       .165
Skewness          -.190      -.070       .260       .202      -.240      -.076
S E Skew           .100       .083       .100       .083       .100       .083
Range             9.000      9.000     13.000     13.000     11.000     11.000
Minimum            .000       .000       .000       .000       .000       .000
Maximum           9.000      9.000     13.000     13.000     11.000     11.000
Sum              2677.0     4156.0     3249.0     5387.0     3691.0     5536.0
Valid cases         601        872        601        872        601        872

Table 6 

Recommended Percentiles for Delphi Matrix Sub-Tests Raw Scores 

                    Analysis              Evaluation         Inference
Raw Score      PRE-ANAL  POST-ANAL   PRE-EVAL  POST-EVAL       INFR

 0                    1          0          1          0          0
 1                    4          2          5          3          1
 2                   10          7         11          8          3
 3                   26         21         23         16          8
 4                   49         42         37         28         18
 5                   75         69         52         42         34
 6                   92         87         70         56         54
 7                   98         97         81         69         71
 8                   99         99         89         81         84
 9                   99         99         94         88         94
10                                         98         94         98
11                                         99         97         98
12                                         99         99
13                                         99         99
14                                         99         99

[Only one of the two Inference columns is legible in the source; whether it is the pretest or the posttest column is unclear.] 


Deductive and Inductive Reasoning Sub-Tests 



The traditional way of dividing the domain of reasoning is by distinguishing 
deduction and induction. These concepts, however, have become notoriously ambiguous 
as a result of important differences in what they denote in different disciplines. Even the 
notion that the one "goes from general to specific" and the other from "specific to general" 
has been discredited both theoretically and by counter-examples from the days of Russell 
and Whitehead on to the present. It is truly regrettable that this dysfunctional, nineteenth 
century notion can still be found in some recent methodology texts. If anything, however, 
this alerts us to be suspicious of any suggestion that the inductive/deductive distinction is 
well understood or even similarly understood across academia. Concern about this 
ambiguity explains why the words "deduction" and "induction" appear nowhere in the 
CCTST. 



However, since the difference between deductive reasoning and inductive 
reasoning, as that distinction is understood among logicians, is both powerful and useful, 
the CCTST offers sub-scores in each. Logicians draw this distinction on the basis of the 
purported logical strength of the inference. If the assumed truth of the premises 
purportedly necessitates the truth of the conclusion, then the argument is classified as 
deductive. Not only do traditional syllogisms fall within this category, but algebraic, 
geometric, and set-theoretical proofs in mathematics (including "mathematical induction") 
also represent paradigm examples of deduction. Instantiation of universalized 
propositions is deductive, as are inferences based on such principles as transitivity, 
reflexivity and identity. In the case of valid deductive arguments, it is not logically possible 
for the conclusion to be false and the premises to be true. 
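
The criterion for deductive validity stated above can be illustrated, for the limited case of propositional arguments, with a truth-table search in Python. This is only a sketch: the helper names are hypothetical, and propositional logic covers just a fragment of the deductive reasoning the text describes. An argument form is reported valid when no assignment of truth values makes every premise true and the conclusion false.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b'."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars):
    """True if no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counter-example found: premises true, conclusion false
    return True


if __name__ == "__main__":
    # Transitivity: if P then Q; if Q then R; therefore if P then R.
    transitivity = is_valid(
        [lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
        lambda p, q, r: implies(p, r),
        n_vars=3,
    )
    # Affirming the consequent: if P then Q; Q; therefore P.
    affirming_consequent = is_valid(
        [lambda p, q: implies(p, q), lambda p, q: q],
        lambda p, q: p,
        n_vars=2,
    )
    print("transitivity valid:", transitivity)
    print("affirming the consequent valid:", affirming_consequent)
```

The second form fails the check because the assignment P false, Q true makes both premises true and the conclusion false.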






By contrast, if an argument's conclusion is purportedly warranted, but not 
necessitated, by the assumed truth of its premises, logicians would consider the argument 
inductive. Scientific confirmation and experimental disconfirmation are examples of 
inductive reasoning. The day to day inferences which lead us to infer that in familiar 
situations things are most likely to occur or to have been caused as we have come to 
expect are inductions.⁶ Statistical inferences are inductive, even if the inference is the 
prediction of an extremely probable specific (rain today) based on general principles 
(meteorological laws) and a given set of observations. Inference used to inform judgment 
by reference to perceived similarities or applications of examples, precedents, or relevant 
cases, such as is typical of legal reasoning, is inductive. Also inductive is that common and 
powerfully persuasive - even if logically suspicious - tool of everyday dialogue, analogical 
reasoning. In the case of a strong inductive argument it is unlikely or improbable that the 
conclusion would actually be false and all the premises true, but it is logically possible that 
it might.⁷ 

Thirty of the items on the CCTST can be readily classified as requiring the proper 
application of either deductive reasoning or inductive reasoning for the designated answer 
to be selected.⁸ The CCTST, thus, offers sub-scores on deductive reasoning and inductive 
reasoning, as those two terms were described above. Table 7 indicates the correlations 
between the two sub-scores and between each and the overall CCTST pretest and posttest 
scores. Table 8 displays statistical data for each of the two sub-tests and Table 9 indicates 
the recommended percentile rankings for each in both the pretest and posttest contexts 
using raw scores. Discretion is again recommended in the use of sub-score results and 
percentile rankings. Use should be restricted to diagnostic purposes, or to summative 
evaluations of modes of instruction, CT programs or aggregate groups of students, in 
contrast to summative evaluations of individuals. 

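
The traditional matrix can be sketched in Python in the same way as the Delphi matrix, using the item lists given in the endnotes. Two readings are assumptions made here because the endnote is partly illegible: the last deductive item is taken to be 30, and the excluded items are derived by elimination rather than read from the source. The function names are hypothetical.

```python
# Item lists from the endnote on the traditional matrix.
# ASSUMPTION: the final deductive item is read as 30 (the source is garbled there).
DEDUCTIVE = [1, 2, 4, 5, 11, 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 30]
INDUCTIVE = [9, 10, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 33, 34]


def excluded_items(n_items=34):
    """Items 1..n_items that appear on neither the deductive nor the inductive sub-test."""
    used = set(DEDUCTIVE) | set(INDUCTIVE)
    return [item for item in range(1, n_items + 1) if item not in used]


def traditional_sub_scores(correct_items):
    """Raw deductive and inductive sub-scores, given the items answered correctly."""
    correct = set(correct_items)
    return {
        "deductive": sum(1 for item in DEDUCTIVE if item in correct),
        "inductive": sum(1 for item in INDUCTIVE if item in correct),
    }


if __name__ == "__main__":
    print("deductive items:", len(DEDUCTIVE), "inductive items:", len(INDUCTIVE))
    print("excluded items (derived by elimination):", excluded_items())
    # Hypothetical record of the items one student answered correctly.
    print(traditional_sub_scores([1, 2, 3, 9, 10, 14, 20, 30, 34]))
```

Unlike the Delphi sub-scores, these two sub-scores need not sum to the overall raw score, because four items belong to neither sub-test.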
Table 7 

Correlations of Traditional Matrix Sub-Tests 

Table 8 

Statistical Analyses of Traditional Matrix Sub-Tests 

               Deductive Reasoning    Inductive Reasoning
               PRE-DEDU  POST-DEDU   PRE-INDU  POST-INDU

Mean              7.689      8.369      6.512      7.018
Std Err            .109       .096       .103       .086
Median            8.000      8.000      7.000      7.000
Mode              8.000      9.000      7.000      8.000
Std Dev           2.660      2.827      2.533      2.550
Variance          7.078      7.992      6.417      6.505
Kurtosis          -.428      -.441      -.241      -.505
S E Kurt           .199       .165       .199       .165
Skewness           .020       .167      -.016      -.093
S E Skew           .100       .083       .100       .083
Range            15.000     15.000     14.000     12.000
Minimum            .000      1.000       .000      1.000
Maximum          15.000     16.000     14.000     13.000
Sum              4621.0     7298.0     3914.0     6120.0
Valid cases         601        872        601        872

Table 9 

Recommended Percentiles for Traditional Matrix Sub-Tests Raw Scores 

               Deductive Reasoning    Inductive Reasoning
Correct        PRE-DEDU  POST-DEDU   PRE-INDU  POST-INDU

 0                    0          0          1          0
 1                    1          0          2          1
 2                    2          1          7          5
 3                    6          3         13          9
 4                   12          7         22         17
 5                   22 [illegible]        34         28
 6                   33         28         48         42
 7                   47         40         65         56
 8                   62         53         79         71
 9                   74         67         88         82
10                   85         77         94         91
11                   92         85         98         96
12                   97         92         99         99
13                   99         96         99         99
14                   99         98         99         99
15                   99         99
16                   99         99



Critical Thinking Dispositions 

The CCTST is designed to assess CT skills; however, the proper exercise of those 
skills presupposes certain crucial CT dispositions. Indeed, the CCTST includes items 
constructed with distractors (wrong answers) hypothesized to be more attractive to persons 
who do not possess the appropriate CT dispositions. Items 5, 9, 19, 20, 24-33, for example, 
include distractors intended to be attractive to persons who lack the dispositions identified 
in the Delphi study under the category of "approaches to specific issues, questions or 
problems," (Facione, 1990 a). Specifically, these dispositions include: 

* clarity in stating the question or concern, 

* orderliness in working with complexity, 
* diligence in seeking relevant information, 

* reasonableness in selecting and applying criteria, 

* care in focusing attention on the concern at hand, 

* persistence though difficulties are encountered, and 

* precision to the degree permitted by the subject and the circumstances. 

Likewise, items 20, 24-26, and 32-34 include other distractors which, it is 
hypothesized, are more likely to be selected by persons who have not developed certain of 
the dispositions which the Delphi research classifies under the heading of "approaches to 
life and living in general." Those related to the CCTST include: 

* trust in the processes of reasoned inquiry, 

* open-mindedness regarding divergent world views, 

* flexibility in considering alternatives and opinions, 

* understanding of the opinions of other people, 

* fair-mindedness in appraising reasoning, and 

* prudence in suspending, making, or altering judgments. 

An interesting extension of this research would be to cluster such CCTST items into 
a sub-test on "CT-dispositions." The designated responses to items on such a sub-test 
would be all those choices which, it would be hypothesized, might be selected by students 
who approach the item with the requisite CT dispositions regardless of whether they apply 
their CT skills correctly. Or, in other words, the "wrong" answers to items on such a 
sub-test would be those distractors which would most likely be selected only by people who are 
hypothesized not to have appropriate CT dispositions. Naturally, to fully validate such a 
sub-test it would be necessary to conduct the kind of interviews which Steven Norris (1989) 
describes in his research regarding construct validation. 






Conclusion 



The CCTST offers three sub-tests conceptualized in terms of the Delphi study, 
Critical Thinking: A Statement of Expert Consensus for Purposes of Educational 
Assessment and Instruction. These three sub-tests, "Analysis," "Evaluation," and 
"Inference," correlate strongly with each other and with the overall CCTST, used as either 
a pretest or a posttest. The same is true of the two CCTST sub-tests which divide along 
the traditional matrix, "Deductive Reasoning" and "Inductive Reasoning." Recommended 
percentile rankings for raw scores on each of the five sub-tests and for the CCTST overall - used 
either as pretests or posttests - have been developed on the basis of analyses of 1673 test 
forms completed by representative samples of college students during the 1989/90 
academic year at a comprehensive state university. 

Endnotes 

1 Control group students' group means were not statistically significantly different from the control group pretest mean, hence their 
scores were used to supplement the size of the sample used to establish these pretest norms. 

2 Relevant data on SAT scores and college GPA for the norm groups was presented in TR #2, "Factors Predictive of CT Skills". 

3 Using the names of the core CT skills identified in the Delphi research, items 1-6 target interpretation, items 6-9 analysis, items 10-13 
evaluation, items 14-24 inference, and items 25-34 explanation. 

4 Data on possible differences by gender, ethnicity, or academic major on the various sub-tests is yet to be analyzed. Likewise the 
relationships of various sub-test scores to other indicators such as SAT or GPA has yet to be determined. 

5 Statistical significance reported here (p<.001) was obtained using the one-tailed test for the significance of Pearson-rho. 

6 Sherlock Holmes used induction but called it deduction. 

7 The Encyclopedia of Philosophy, which is authoritative in such matters, defines "deduction" as "a form of inference such that in a valid 
deductive argument the joint assertion of the premises and the denial of the conclusion is a contradiction." The reference goes on to 
contrast inferences of the deductive variety with those in which that joint assertion would not be contradictory and includes inductive 
inferences as among this latter group. 

8 The sixteen deductive items are: 1, 2, 4, 5, 11, 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, and 30. The fourteen inductive items are: 9, 10, 20, 
21, 24, 25, 26, 27, 28, 29, 31, 32, 33, and 34. The excluded items are: 3, 6, 7, and 8. 